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ABSTRACT The quantitative description 
of model protein folding kinetics using a dif- 
fusive collective reaction coordinate is exam- 
ined. Direct folding kinetics, diffusional coeffi- 
cients and free energy profiles are determined 
from Monte Carlo simulations of a 27-mer, 3 
letter code lattice model, virhich corresponds 
roughly to a small helical protein. Analytic 
folding calculations, using simple diffusive rate 
theory, agree extremely well with the full sim- 
ulation results. Folding in this system is best 
seen as a diffusive, funnel-like process. 

I Introduction 

Protein folding is a collective self-organization pro- 
cess, conventionally described as a chemical reaction. 
However, this process generally does not occur by an 
obligate series of discrete intermediates, a "pathway," 
but by a multiplicity of routes down a folding fun- 
nel. Dynamics within a folding funnel involves the 
progressive formation of an ensemble of partially or- 
dered structures, which is transiently impeded in its 
flow toward the folded structure by trapping in local 
minima on the energy landscape. As one proceeds 
down the folding funnel, the different ensembles of 
partially ordered structures can be conveniently de- 
scribed by one or more collective reaction coordinates 
or order parameters. Thermodynamically this funnel 
is characterized by a free energy that is a function of 
the reaction coordinate which is determined by the 
competition between the energy and entropy. A cru- 
cial feature of the funnel description is the concerted 
change in both the energy and the entropy as one 
moves along the reaction coordinate. As the entropy 



decreases so does the energy. The gradient of the free 
energy determines the average drift up or down the 
funnel. Superimposed on this drift is a stochastic mo- 
tion whose statistics depends on the jumps between 
local minima. To first approximation this process 
can be described as diffusion. Folding rates are de- 
termined both by the free energy profile of the fun- 
nel and the stochastic dynamics of the reaction co- 
ordinates. In this paper we study the dynamics of 
a reaction coordinate describing the folding funnel 
of a lattice model with thermodynamic parameters 
which theoretical arguments suggest realistically cor- 
respond with fast folding small helical proteins.'^ We 
determine from simulation both the free energy pro- 
files and effective configurational diffusion coefficients 
for a folding reaction coordinate as a function of tem- 
perature. Folding times, also determined by simula- 
tions, are compared to analytical calculations based 
on these quantities. 

The kinetic analysis of a single folding funnel is 
most appropriate for the fast folding proteins that 
mainly fiow downhill progressively in energy to the 
folded state. During folding, even for this case, trap- 
ping in local minima will occur for times very short 
compared the overall folding time. Alternatively, 
when the residence time in these traps becomes too 
long leading to a substantial slow down of the folding 
time, the traps can be thought of as creating addi- 
tional funnels. This occurs near the glass transition of 
the heteropolymer. Good folding sequences will fold 
at a temperature Tf, which is above the glass tran- 
sition temperature, Tg, and have a single dominant 
folding funnel. As the chain moves downhill energeti- 
cally in its dominant funnel and becomes more similar 
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to the native structure , the configurational entropy 
of the chain (number of available states) is reduced. 
For the model discussed here, a single order param- 
eter' Huffic,c;s to mc!asure similarity. For real proteins 
more order parameters may be necessary to quantify 
the similarity of the configuration to the native struc- 
ture for example: degree of collapse, helicity as well 
as fraction of correct contacts and dihedral angles 
in the general case. A free energy surface as a func- 
tion of these order parameters can then be calculated. 
For the case of a single dominant funnel, these order 
parameters may also be associated with reaction co- 
ordinates. The motion of these reaction coordinates 
will be largely diffusive due to the transient trapping. 
For the single funnel case, the folding time is gener- 
ally determined both by the difficulty to overcome the 
free energy barrier and a prefactor that depends on 
the ruggedness of the landscape that enters via the 
diffusion coefficients. This general quantitative de- 
scription of folding using diffusive coordinates along 
with energy landscape ideas to account for trapping 
was introduced by Bryngelson and Wolynes*''' (BW). 
A short overview of their ideas is given in the next 
section. Many of the qualitative ideas from that de- 
scription (e.g. the importance of the relationship be- 
tween Tf and Tg for kinetics) have been qualitatively 
confirmed in lattice studies of model proteins^'^' and 
others have studied the dynamical properties of these 
model systems.^' ^^^^^ 'j'j^g main goal of the present 
paper is to quantitatively compare the BW expres- 
sions for the folding time and effective diffusion co- 
efficients with the simulation results over a range of 
thermodynamics conditions for a model that theory 
suggests realistically corresponds with the faster fold- 
ing small proteins. 



In a previous paper, we started this analysis. 
There it was shown that at the folding temperature 
{Tf) the trajectories of appropriate collective reaction 
coordinates are Brownian and must surmount a mod- 
est thermodynamic free energy barrier. The folding 
time at Tf was well described by the diffusive rate 
formula. This work shows how, using a combination 
of simulation results and analytical theories, to for- 
mulate a law of corresponding states relating simple 
lattice models to small helical proteins. Since free 
energy surfaces are temperature dependent, various 
scenarios for the folding mechanism apply at differ- 
ent temperatures.^ By testing the validity of analyti- 
cal formulas over a wide temperature range, we hope 
to provide a route to use theoretical descriptions to 
design and understand quantitative experiments. 



II A summary of the Bryngelson and 
Wolynes Energy Landscape Picture 

The energy landscape of any heteropolymer is com- 
plex. Because of the possibility of making many con- 
tacts involving residues that are distant in sequence, 
the energy landscape is rough. This is an effect of 
frustration. Protein-like heteropolymers obey a prin- 
ciple of minimal frustration, leading to an additional 
funnel-like aspect of the landscape. To quantify this, 
we recognize that the energy landscape is stratified, 
that is to say the statistical characteristics of the 
landscape depend on the distance from, or equiva- 
lently the similarity to, the ideal native structure. 
This similarity measiire is known as an order param- 
eter in the physics of phase transitions, and for small 
systems such as proteins can also be used as a reac- 
tion coordinate for computing the folding rate. (Here 
this the reaction coordinate will be called n.) Clearly 
such a reaction coordinate is not unique. A picture of 
such a stratified landscape with kinetic connections is 
shown in figure 1. BW show that to a first approx- 
imation, the folding time can be computed by first 
grouping together states with in a stratum having a 
common value of the reaction coordinate. A diffu- 
sion equation for the probability flow between strata 
is derived under the assumption that the reaction co- 
ordinate can only change by relatively small steps, 
and it is written as 



dP{n,t) 



dP{n,t) 
dn 



+ P{n,t) 



d (3F{n) 



dn 



(1) 



The average direction of the flow is given by the gradi- 
ent of the free energy. The diffusion coefficient, D{n), 
depends on the trapping in local minima and reflects 
the ruggedness of the energy landscape in the system 
proximity of the glass transition. 

The free energy F{n) incorporates the balance be- 
tween two terms, the energy that is decreased as 
the native state is approached and the entropic term 
—TS{n) which decreases with unfolding. The shape 
of the free energy proflle is therefore strongly temper- 
ature dependent as illustrated in figure lb. At high 
temperatures, folding is an uphill process thermo- 
dynamically so folding is exponentially suppressed. 
At the folding temperature, the free energy profile is 
typically bistable with a small thermodynamic bar- 
rier which arises from the incomplete cancellation of 
the entropy by the energy as the systems descends 
through the funnel. At low temperatures, folding be- 
comes a downhill process. Thus, if the diffusion co- 
efficient were temperature independent, the rate of 
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Figure 1: The top figure shows the l<inetic landscape versus 
an arbitrary reaction coordinate (n). The vertical line shows 
the glass transition point (ng) in which the behavior changes 
from many paths to few paths. In addition to the native state 
there are a number of local minimum. The bottom figure 
shows the free energy plotted as a function of the same 
coordinate. The line at nt denotes the transition region. 



folding would have a dependence on the thermody- 
namic driving force (which depends on 1/T) just as 
a solid state rectifier depends on the applied voltage 
(as in the famous example of Feynman). The folding 
time can be written as a double integral 



/■"fold 

= dn 

J "unt J 



' exp[/3F(n)-/3F(nO] 

D{n) ^ ^ 



When the barrier exists, as at T/, F{n) has a 
double-well with the bottom of one well close to riunf 
and the bottom of the other well at nfoia- In this sit- 
uation this double integral can be approximated by a 
Kramer's like law,^^'^'' 



27r\ exp{/?[F(nt)-F(n„nf)]} 

J Do Wunl Wfold 



where 



X{n) = log 



"L»(n)" 



(5) 



and Wunf is the curvature around riunf and Wfoia is the 
curvature in the top of the barrier. Do is the effective 
diffusion coefficient on an equivalent flat landscape. 

The prefactor of Kramer's expression reflects the 
multiple recrossings of the barrier through diffusion 
that is controlled by trapping. When the process is 
entirely downhill, the double integral becomes sim- 
ply proportional to the diffusion coefficient, and only 
weakly depends on the slope of the free energy gra- 
dient. 

The configurational diffusion coefficient depends 
both on the local moves allowed to the protein and 
the energy landscape topography. BW analyzes con- 
figurational diffusion by assuming the limit of an un- 
correlated rugged energy landscape and Metropolis 
rules. This is, of course, a caricature of the dynam- 
ics in which the landscape will have correlations and 
the barriers to escape from traps may be surmounted 
by successively melting out local clusters rather than 
globally changing the protein conformation. Accord- 
ing to the BW analysis, the diffusion coefficient has a 
strong temperature dependence that arises from the 
necessity to escape from local minima on the rough 
energy landscape. At high temperatures, D follows a 
Ferry law typical of glasses. 



D{T,n) = Do cxp[-/3^AE^(n)] 



(6) 



where AE'^{n) is the local mean square fluctuation in 
energy. At intermediate temperatures, the strongly 
non-Arrhcnius temperature dependence becomes it- 
self moderated 

D{T,n) = 

Doexp{-5*(n) + [l3g{n)-P]^AE\n)} (7) 

This equation is valid for temperatures between Tg (n) 
and 2Tg{n). This form arises because of the assumed 
Metropolis dynamics. The characteristic tempera- 
ture is the ideal glass transition where the configu- 
rational entropy would vanish at equilibrium. This is 
given by the condition 



Tg{n) = 



AE{n) 
[25* (n)] 1/2' 



(8) 



F{n) = F{n) - TX{n) 



where S*{n) is configurational entropy at n. We see 
that at Tg, the diffusion coefficient is diminished from 
its bare value by a factor of the total number config- 
uration states, thus giving rise to a possibility of a 
(4) Levinthal paradox. Below Tg, we expect both that 



4 



10' 




10-= I I 

0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 
P 



Figure 2: Mean first passage time (t/) is plotted as a func- 
tion of the inverse temperature (/?). Note, parabolic depen- 
dence of the folding time on f3. 

the diffusion coefficient will still possess an activa- 
tion energy but the activation energy will not be self- 
averaging and that the free energy itself can fluctuate 
considerably and will be very sensitive to details of 
the model and to the sequence of the heteropolymer. 
In the theory of disordered systems this is known 
as the emergence of non self-averaging behavior, the 
hallmark of the ideal thermodynamic spin glass tran- 
sition in mesoscopic systems. Below Tg, the slow fold- 
ing process can be better described by a few kinetic 
pathways than by the statistical average laws appro- 
priate for the faster events. Both the temperature 
dependence of D and the interplay of entropy and 
energetic ruggedness in the driving force lead to a 
parabolic dependence of the folding rate on the in- 
verse temperature as shown in figure 2. This rough 
form has been confirmed in many lattice simulations. 
Leite and Onuchic""^^ have shown that for an energy 
landscape model for solvent polarization, analogous 
to the BW landscape, there is a gradual transition 
into a slow non-self averaging dynamics. Fluctuations 
gradually dominates the kinetics below this temper- 
ature. 

Folding rates depends on both diffusion and free 
energy profile in a way that is significantly different 
from the standard transition state theory (TST) The 
extent to which the rate is determined by the free en- 
ergy profile versus the diffusion coefficient, that incor- 
porates the multiple barrier crossing, depends on the 
choice for the reaction coordinate. Notice however, 
that while it may be impossible or at least difficult 
to find a reaction coordinate for which TST is exact, 
the diffusion formula (eq. 2) is essentially invariant 
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Figure 3: The native state (minimum energy) cube 
for the sequence studied in this worl<. The se- 
quence (ABABBBCBACBABABACACBACAACAB) consist of three 
monomer types. 

to this choice as long as the elementary moves are 
reasonably local for this coordinate. 

In the original BW treatment, this reaction coor- 
dinate was represented by an order parameter p that 
measured the similarity between any given configu- 
ration and the native state of the protein in terms of 
the fraction of correct dihedral angles, a coordinate 
which is manifestly local because of the elementary 
moves of the protein at the microscopic level are con- 
trolled by isomerization of the peptide bond. Another 
possible reaction coordinate is the number of correct 
contacts, a coordinate that is only partially local. 

Ill Configurational diffusion for lattice 
models in proteins 

Are these general ideas developed by Bryngelson 
and Wolynes (BW) valid for quantitatively predict- 
ing folding times in model proteins with a realistic 
energy landscape topography? We show in this sec- 
tion that as long as the glass transition falls after the 
transition region (top of the barrier in the free energy 
profile for the collective reaction coordinate) that this 
is the case. In this limit the single dominant funnel 
picture is appropriate. The system under study is the 
designed three-letter code 27-mer lattice model (see 
figure 3) used in our recent studies.^' Although 
this 27-mer is far from being a real protein, it has 
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Figure 4: The energy and order parameter (Q, the number 
of correct native contacts) plotted as a function of time for a 
sample folding simmulation. The temperture is the folding 
temperature (T/ = 1.51). The time is roughly 30 times 
the folding time (tj' ~ 3 x 10''). The plot shows the two- 
state-like behavior of this system with transition between the 
native state {E — —84, Q — 28) and a molten globule region 
[E ~ 50, Q ~— 7). The transition are extremly rapid with 
respect to the folding time. In addition there are significant 
fluctations about the two states. 

been shown'^ that by developing a law of correspond- 
ing states, the results obtained for such a system can 
be used to describe a small a-helical protein. In these 
simulations the units of temperature and energy are 
such that kb — 1. This still leaves an arbitrary scale 
factor since the only important quantity is the ratio of 
energy to temperature. We have chosen small, integer 
values for the coupling energies for convenience and 
efhciency. The kinetic glass temperature is Tg ^ 1. 

Figure 4 shows the time evolution at Tf for en- 
ergy and the Q order parameter. This Q parameter 
is a measure of similarity to the native state and is 
equal to the number of correct contacts (i.e. con- 
tacts that exist in the native state). It ranges from 
0, no correct contacts, to 28, the maximum number 
of possible contacts. Figure 5 plots the energy, en- 
tropy and free energy as a function of Q for various 
temperatures. At temperatures above the glass tem- 
perature, there is a significant gradient in both energy 
and entropy. These profiles are computed from the 
density of states obtained using the Monte Carlo his- 
togram technique. Starting from a random configu- 
ration, collapse occurs at times very short compared 
to the folding time. Thus in this parameter range, 
the radius of gyration need not be considered as a 
separate dynamical reaction coordinate; however, in 
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Figure 6: A folding trajectory (Q(t)) superimposed on the 
free energy {F[Q]) at the folding temperature. The trajec- 
tory is approximately 9x 10 ' Monte Carlo steps long (approx- 
imately 27% of Tf). The right axis shows the time counted 
backwards from the folded state. Note, most of the time 
is spent diffusing in the molten globule region. Once the 
barrier is surmounted folding proceeds rapidly. Numerous 
recrossings of the barrier region indicate the need to use a 
diffusive theory for this reaction coordinate note transition 
state theory. 

determining the free energies one must note that the 
mean radius of gyration does vary with temperature. 
A molten globule band, where configurations have an 
average of 20 contacts and Q — 7.5 (w 27% similar- 
ity to the native state), describes the region where 
this sequence spends most of its time on its way into 
the folded state. The shape of the free energy in the 
vicinity of this mean Q is quasi- harmonic, although 
in fact large deviations are possible at high tempera- 
tures. As shown in our earlier work, since at Tf the 
local glass transition as a function of Q occurs after 
the transition region has been overcome, the folding 
time is determined primarily by the time taken to 
overcome the free energy barrier, i.e., to cross the 
transition region. Figure 6 shows a folding trajectory 
of the Q coordinate superimposed on a plot of the free 
energy. The plot shows a time span which is roughly 
one quarter of the mean folding time (r/) at this tem- 
perature. Most of the trajectory consists of diffusive 
motion about the molten globule region. Once the 
barrier has been surmounted folding occurs rapidly, 
taking roug hly 10^ Monte Carlo steps (« .03t/). 

However, to estimate the folding times at a variety 
of temperatures, knowledge of the free energy bar- 
rier alone is not sufficient. Information about the 
dynamics must be obtained by calculating the con- 
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Figure 5: Energy, entropy and free energy as a function of Q for several temperatures calculated using the Monte Carlo 
histogram method. For a broad range of temperatures the free energy has a clear double well form: one at the native state 
{Q = 28) and one at the molten globule region (Q ~ 7). The double well form is consistent with the two state behavior 
scene in figure 4. Above the glass transition temperature {Tg ~ 1) there is a significant energy and entropy gradient 
between the molten globule region and the transition region {Q ~ 15). Both AE and AS are less than zero, so the 
unfavorable reduction in entropy is offset by a loss in energy, leaving a small free energy barrier at the folding temperature 
(T = 1.509). As the temperature approaches the glass temperature the energy and entropy gradients decrease as does the 
free energy barrier. However the folding time diverges due to the increase in the diffusion constant of the system (i.e., the 
roughness of the energy landscape). 
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figurational diffusion coefficient through the complex 
energy landscape. As described in the previous sec- 
tion, when a single reaction coordinate is considered, 
for example Q, tj can be computed using the dou- 
ble integral give by eq. 2. In general the diffusion 
coefficient will depend on Q but, one more simplifi- 
cation is assumed here. Only the average value of D, 
computed for states in the molten globule band, is in- 
ferred from simulations. We do this by computing the 
correlation function of the fluctuations of the reaction 
coordinate AQ{t). Within the quasi-harmonic diffu- 
sive approximation this correlation function should 
decay exponentially at long times. The configura- 
tional diffusion coefficient will be related to the cor- 
responding correlation time and the mean square 
instantaneous value of the reaction coordinate fluc- 
tuations, AQ^{T). (This calculation is constrained 
for values of Q before the transition region. ) At 
a given temperature, the diffusive harmonic model 
gives D = AQ'^{T)/tcoii{T). Since Tcorr < tj, it is a 
quantity that is much easier to obtain from numerical 
simulations for more complex systems such as atom- 
istic simulations of proteins. For experimentalists we 
especially wish to point out this viewpoint and the 
corresponding approximation allow one to use dy- 
namic probes of fluctuations in the molten globule 
state at equilibrium to predict the rate of the non- 
equilibrium folding process. 



In figure 7 we show the simulated Tcorr and dif- 
fusion coefficient for various temperatures above the 
glass transition. The autocorrelation time as been 
calculated in the molten globule region. To do this, 
the correlation function < Q{t)Q{0) > was calculated 
over paths which were confined to the molten globule 
region of the conformation space. For these calcu- 
lations we define any conformation with Q < 17 as 
being in the molten globule region. The decays are 
in fact non-exponential for short times. We have not 
analyzed the non-exponentiality in detail. The in- 
formation would provide the dynamics of individual 
conformation steps and should contain information 
on the distribution of substate lifetimes. We identify 
the long time decay with the effective diffusive har- 
monic value. A more complete model should account 
for this short time decay via a frequency dependent 
diffusion coefficient as proposed by BW. Frequency 
effects may be important for determining the pref- 
actor of barrier crossing as discussed by Grote and 
Hynes for simple chemical reactions. 
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Figure 7: Autocorrelation time (Tcorr) and diffusion coeffi- 
cient (AQ^/rcorr) plotted as a function of the inverse tem- 
perature (3). At low temperature (high /3) the diffusion coef- 
ficient decreases rapidly while the correlation time increases, 
indicating a slow down of the dynamics due to trapping in 
local minimum. 



IV Details of simulations and comparison 
with BW analysis 

To study the dynamics of this model we used the 
Monte Carlo algorithm with local moves, which in- 
clude end moves, corner flips and 90° crankshaft 
moves. The excluded volume constraint was enforced 
by not allowing any multiple occupancy. (Further 
details can be found in a previous work.^°) To calcu- 
late the mean first passage time (rf) as a function of 
temperature the following procedure was used. For 
each temperature many simulations were performed, 
each starting from a different random unfolded initial 
condition. The simulations were run until the chain 
found the folded state or until a maximum number 
of time steps Tmax had elapsed. The mean first pas- 
sage time is simply the average of the folding times 
for the different runs. For an intermediate range of 
temperatures all of the runs would find the folded 
state within Tmax steps. However for very low and 
high temperatures only some fraction would find the 
folded state in the allowed time. For these temper- 
atures the mean calculated is a lower bound to the 
actual mean first passage time. This is because for 
trials that do not fold within Tmax steps, we average 
in 7"max which is less than the actually folding time 
for that run. This is done for practical reasons since 
we have a finite amount of computer time for the sim- 
ulations. The key point is that Tmax is much greater 
than Tf for a broad range of temperatures; which in 
fact, it is. Figure 2 shows the mean first passage time 




Figure 8: Qualitative replica symmetry breaking value 
Y{Q) = [P^iQ)f P^°^ at three temperatures (Tg, Tf 
and 2Tg). Note that although at the global glass tempera- 
ture (Tg) discrete states are apparent even for small degrees 
of nativeness, at the folding temperature the discrete inter- 
mediates are largely native-like. 

simulated for a range of temperatures. 

At low temperatures, tj grows rapidly. This slow 
down is caused by trapping in local minima. It is well 
known that heteropolymers undergo an ideal glass 
transition at low enough temperature. The random 
energy model estimate of this temperature for the 
present system is T^^^^ = AE^/y/2SZ. The mea- 
sured energetic fluctuations Ai^^'s are 51 at Tf and 
22 at T = 1.1 (slightly above the glass transition). 
The respective configurational entropies for the col- 
lapsed states, Sl, are around 20 and 12. Both cases 
provide a T^^'^' - 1.0. This agrees with the critical 
temperature at which replica symmetry breaking sets 
in. This is determined by finding the temperature at 
which Y — J2iPi becomes of order 1 (see figure 8). 
Pi is the Boltzmann occupation of a microstate. Y is 
not the perfect measure of replica symmetry breaking 
since it assumes the basin of attraction of a microstate 
can be identified with the state itself. The corre- 
sponding neglect of correlation between these high Q 
states, however, is able to provide a reasonable esti- 
mate of Tg for a finite mesoscopic system. 

These estimates of Tg are very similar to those of 
the kinetic glass transition which is the temperature 
at which the folding time slows down substantially. 
We define the kinetic glass transition temperature 
(Tg) to be the temperature at which the folding time 
is sufficiently slower than the fastest folding time ob- 
served. There are several ways of choosing this point 
which will give roughly the same answer. We choose 



Figure 9: Autocorrelation function for Q (see eq. 9) plotted 
on a semi-log scale. T = Tf. The correlations have an initial 
rapid decay followed by a slower decay. By fitting a sum of 
exponentials (usually 3 or 4) the autocorrelation time can be 
determined. 

Tg, to be the temperature at which Tf — ^Tmax- For 
this sequence, ~ 1. 

In addition to the kinetic runs to calculate Tf, we 
performed a number of thermodynamic runs to de- 
termine the free energy profiles. Again most of the 
details can be found in a previous work.^-'^ We used 
the Monte Carlo histogram technique which allows us 
to calculate extensive quantities like the free energy. 
The basic idea is to store a histogram as a function 
of Q and E. These histograms measure the probabil- 
ity that a given {Q, E) pair will occur. From these 
we can calculate the density of states n{Q, E) up to 
a multiplicative constant (which can be determined 
since we know the n(l, E'min) = !)■ We can then cal- 
culate any thermodynamic quantity; in particular, we 
determined the free energy as a function of Q {F[Q]) 
for several temperatures. The free energy is shown 
in figure 5. For a broad range of temperatures the 
free energy has a rough double well profile, with one 
minimum at the folded state {Q = 28) and another 
for the molten globule (Q ~ 5-7). The transition 
state varies from = 22 at T = 2 to = 16 at 
the lower temperatures. At first it might appear the 
transition state is at — 25 since this is the highest 
local maximum in the plot at most temperatures.^" 
However, this turns out not to be the correct transi- 
tion point. One must be careful in interpreting free 
energy plots of this type. There are Monte Carlo 
moves that connect states with Q = 23 directly to 
the ground state short circuiting this barrier. In fact 
any barrier that is smaller in width than 5 can also 
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be by passed by certain Monte Carlo moves. Note, 
the correct barriers from 22 to 16 are much broader. 
In fact it is better to think in terms of a transition 
"rc;gion" rather than a unique transition point. The 
way in which we identified this region was to look 
at the time trajectories (figure 4). Since we are in a 
diffusive regime, a trajectory that reaches the transi- 
tion region should have some probability of diffusing 
back down the barrier rather than crossing over. For 
Q-f values between 16-22 we see such an effect. How- 
ever, by the time = 25 is chosen there would be 
virtually no trajectories that cross back without first 
reaching the folded state. We have also previously 
used a 2 dimensional analysis of the trajectories and 
free energy surfaces and again we find the transition 
region to be between 16-22 and not at 25.^ Since we 
know that the transition state is somewhere between 
16 and 22 we define a specific transition point, for 
use in calculating the barrier for the Kramer's rate 
equation, as the value of Q that is a maximum in 
this region. 

In addition to the free energy barrier we need to 
measure the diffusion coefficient in this system in or- 
der to calculate the folding time. We compute the 
diffusion coefficient by measuring the autocorrelation 
time of Q. Specifically we calculate: 

c,(A) = (QWQft + A))-(g(0)- 
(QHt)) - {Qit)? 

Figure 9 shows a semi-log plot of the auto correlation 
function at T = 1.509, the folding temperature. The 
figure is multi-exponential with a short fast decay 
followed by a much longer, slower decay. The final 
drop-ofi^ is mostly due to errors in sampling at very 
long times. We are interested in the long time decay 
and calculated the exponent associated with this by 
fitting a function of several exponentials (usually 3 or 
4) to this plot. We repeated this calculation at several 
temperatures (note that we can not use the histogram 
method here do the large amount of histogram infor- 
mation that would have to be stored). Figure 10 is 
a log-log plot of the autocorrelation function for sev- 
eral temperatures. As the temperature is decreased 
the autocorrelation time increases. Finally the dif- 
fusion coefficient was estimated as = AQ^/Tcorr- 

With the diffusion coefficients and free energy sur- 
faces in hand, it is now possible to test the analytical 
prediction given by eq. 2. Using the discrete form 
of the full double integral (with a constant diffusion 
coefficient): 
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Figure 10: Q Autocorrelation function plotted on a log-log 
scale for several temperatures. As the temperature decreases 
the correlation time increases. 



we obtain the results presented in figure 1 1 . Note the 
sum is not taken to Q = 28 since we do not expect the 
diffusion coefficient to remain constant in that region 
(between Q = 23-28). However, the time it takes to 
go from Q = 23 to the folded state {Q = 28) is small 
compared to Tf (as seen in figure 6) so the error from 
truncating the sum will also be small. Overcoming 
the barrier {Q w 17) is the limiting step. Addition- 
ally, we can also use a simplified form of eq. 2 which 
assumes that the free energy surface is well approx- 
imated by a parabolic potential around the molten 
globule states that extends all the way to the bar- 
rier (eq. 3). If we further assume that D{n) = Dq 
and that the two curvatures are the same then the 
folding time is given by Tf = 2ttTcovv^^^^'^ (which is 
also shown in figure 11). This simple result is useful 
because it depends only on the correlation time in 
the molten globule and the free energy barrier. The 
agreement with the full formula is remarkable. For 
both formulas, the fastest time at the optimal kinetic 
folding temperature is obtained to well within a fac- 
tor of two. Also, an extremely good description of 
the behavior for the slow down anticipated for low 
and high temperatures is obtained. Thus we see the 
analytical theory based on the actual molten globule 
dynamics and funnel free energy profile is not simply 
qualitatively correct but can be used for quantitative 
predictions of the folding time over a wide thermody- 
namic range. 

Up to this point, the values for the diffusion coef- 
ficients have been obtained directly from the simula- 
tion data for the time correlation functions around 
the molten globule. It is interesting now to com- 
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Figure 11: Comparison of mean first passage times from 
IVlonte Carlo simulations with folding times calculated both 
the discrete from of the double integral (eqs. 2 and 10) and 
the simple rate formula 27rrcorre'^^^ which assumes a con- 
stant diffusion coefficient and that the curvature at the top 
and bottom of the free energy are the same. 
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Figure 12: Diffusion coefficient plotted versus (/3g — /3)^ 



pare the obtained results with the existing models 
of configurational diffusion, particularly the BW the- 
ory. Since our results fall in the range between Tg 
and 2Tg, the intermediate temperature formula given 
by eq. 7 would be seem to be the appropriate com- 
parison. Figure 12 shows a plot of the diffusion coeffi- 
cients as a function of {(3g — (3)^. The slope obtained 
from the semi-log plot provides a fitted value for AE"^ 
of ~ 15. This is a little bit smaller than the AiJ^'s 
obtained directly from simulations at this tempera- 
ture range. This is expected because correlations in 
the landscape cause only part of the total energetic 
ruggedness to come into the activation barriers for es- 
caping from traps. Qualitatively agreement between 
the model and simulations suggests that while the 
REM results are a first approximation, the correla- 
tions in the energy landscape are significant and the 
effective fluctuations for escape from a trap involves 
only a fraction of the chain. ^^'^^ We would expect 
these correlation effects to become more important as 
the effective chain length of the protein grows, how- 
ever we should bear in mind that the same qualitative 
behavior still applies. In the temperature range stud- 
ied, we note the data could be equally well fitted by 
an Arrhenius temperature plot (see figure 13). The 
effective activation energy is ~ 10 (~ IksTf). This 
very large number implies that the effective moves 
of Q require substantial organization of the protein 
since the stability gap is only 20kBTf. The simple 
Ferry law fit is also adequate providing a AE'^ ^ 8 




Figure 13: Diffusion coefficient plotted versus /3 and 
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(see eq. 4 and figure 13). 

V Conclusions 

Protein folding is remarkable as a chemical reac- 
tion with a highly non-Arrhenius temperature depen- 
dence. Although for real proteins some of this be- 
havior is doubtless due to the entropic nature of hy- 
drophobic forces, we see here that this unusual tem- 
perature dependence also arises from the competition 
between trapping in misfoldcd states and drift down 
the folding funnel which is itself a competition be- 
tween entropy and energy. These effects can be de- 
scribed through collective coordinates, which make 
quantitative the nature of the funnel. Our results 
establish that a description in terms of a single col- 
lective coordinate suffices to explain folding kinetics 
of small lattice models over a wide range of temper- 
atures. We believe that a similar set of concepts can 
be used for the smaller real proteins because of their 
correspondence with the lattice models. The diffu- 
sive dynamics should be contrasted to the more tra- 
ditional transition state theory. These simulations 
suggest that if transition state theory is used a very 
complex reaction coordinate, reflecting the detailed 
trapping dynamics, would have to be constructed. 
The single collective diffusive coordinate picture on 
the other hand is much more robust and can make 
use of experimental information about the dynamics 
of fluctuations in the molten globule, as well as sim- 
ple thermodynamic measurements and theories about 
the molten globule state. 

The lattice model studied here are described quite 
well by a single order parameter or reaction coordi- 
nate. Various scenarios for protein folding require the 
introduction of a few more collective coordinates to 
describe the funnel. Even the small helical proteins, 
that corresponding in a thermodynamic sense with 
the 27-mer lattice, require the introduction of coordi- 
nates describing the amount of helicity, degree of col- 
lapse and liquid crystallinity of the dynamical molten 
globules. If these order parameters are constant or 
their dynamics is "slaved" to the tertiary ordering 
parameter the one-dimensional diffusive theory for 
funnel dynamics will suffice for real proteins just as 
in the model systems. If the collective diffusion co- 
efficients are wildly different for these extra degrees 
of freedom or when there is a free energy profile for 
the coupled order parameters with several minima, 
one must generalize the Kramer's expression to a few 
more dimensions. We believe such a few-dimensional 
generalization of the current diffusive dynamical de- 
scription should suffice for the smaller fast folding 
protein molecules. On the other hand, sufficiently 



large proteins doubtless require a description in terms 
of geometrically localized order parameters not just a 
globally defined coordinate. One example of geomet- 
rically localized phenomena is provided by foldons.^^ 
Foldons are kinetically autonomous folding subunits, 
that are expected to exist in the larger proteins. Each 
foldon will have its own funnel. A related description 
involving local collective coordinates necessary for de- 
scribing critical nuclei may also be helpful. Some 
arguments suggest critical nuclei may be large^'^ so 
the system would be well treated by global reaction 
coordinate while others suggest nuclei may smalP^ 
or indeed possibly unique.^''' Any of these cases can 
be accommodated by the diffusive funnel dynamics 
picture once the coordinates are specified. We ex- 
pect the diffusive funnel dynamics picture (discussed 
here and previously by us"^) will provide a convenient 
quantitative framework to analyze both simulation 
and laboratory experiments. 
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